Capacity of Sum-networks for Different Message Alphabets
A sum-network is a directed acyclic network in which all terminal nodes
demand the `sum' of the independent information observed at the source nodes.
Many characteristics of the well-studied multiple-unicast network communication
problem also hold for sum-networks due to a known reduction between instances
of these two problems. Our main result is that, unlike a multiple-unicast
network, the coding capacity of a sum-network is dependent on the message
alphabet. We demonstrate this using a construction procedure and show that the
choice of a message alphabet can reduce the coding capacity of a sum-network
from to close to
Privacy-Preserving Adversarial Networks
We propose a data-driven framework for optimizing privacy-preserving data
release mechanisms to attain the information-theoretically optimal tradeoff
between minimizing distortion of useful data and concealing specific sensitive
information. Our approach employs adversarially-trained neural networks to
implement randomized mechanisms and to perform a variational approximation of
mutual information privacy. We validate our Privacy-Preserving Adversarial
Networks (PPAN) framework via proof-of-concept experiments on discrete and
continuous synthetic data, as well as the MNIST handwritten digits dataset. For
synthetic data, our model-agnostic PPAN approach achieves tradeoff points very
close to the optimal tradeoffs that are analytically derived from model
knowledge. In experiments with the MNIST data, we visually demonstrate a
learned tradeoff between minimizing the pixel-level distortion and
concealing the written digit.
Comment: 16 pages
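The variational approximation of mutual information mentioned in this abstract can be illustrated with a standard lower bound. The notation below is assumed for illustration and is not necessarily the paper's: X is the sensitive attribute, Z is the released data, and q is any conditional model of X given Z (the adversary's role).

```latex
% Assumed notation: X = sensitive attribute, Z = released data,
% q(x | z) = any conditional distribution (the adversary's model).
\begin{align}
I(X;Z) &= H(X) - H(X \mid Z) \\
       &\ge H(X) + \mathbb{E}_{p(x,z)}\left[\log q(x \mid z)\right],
\end{align}
% with equality when q(x | z) = p(x | z).
```

Maximizing the right-hand side over q tightens the bound, which is why an adversary trained to predict X from Z can serve as an estimator of the privacy leakage that the release mechanism then tries to reduce.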
Zero-error Function Computation on a Directed Acyclic Network
We study the rate region of variable-length source-network codes that are
used to compute a function of messages observed over a network. The particular
network considered here is the simplest instance of a directed acyclic graph
(DAG) that is not a tree. Existing work on zero-error function computation in
DAG networks provides bounds on the computation capacity, which is a
measure of the amount of communication required per edge in the worst case.
This work focuses on the average case: an achievable rate tuple describes the
expected amount of communication required on each edge, where the expectation
is over the probability mass function of the source messages.
We describe a systematic procedure to obtain outer bounds to the rate region
for computing an arbitrary demand function at the terminal. Our bounding
technique works by lower bounding the entropy of the descriptions observed by
the terminal conditioned on the function value and by utilizing the
Schur-concave property of the entropy function. We evaluate these bounds for
certain example demand functions.
Comment: 18 pages, 2 figures, submitted to IEEE Transactions on Information
Theory
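The Schur-concave property of entropy used in this abstract says that if a distribution p majorizes a distribution q, then H(p) <= H(q). The following is a minimal sketch of that property only, not of the paper's bounding procedure; the function names `entropy` and `majorizes` and the example distributions are assumptions made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a probability vector p."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def majorizes(p, q, tol=1e-12):
    """Return True if p majorizes q.

    Both vectors are sorted in decreasing order; p majorizes q when every
    partial sum of p is at least the matching partial sum of q and the
    totals agree.
    """
    ps = sorted(p, reverse=True)
    qs = sorted(q, reverse=True)
    n = max(len(ps), len(qs))
    ps += [0.0] * (n - len(ps))  # pad the shorter vector with zeros
    qs += [0.0] * (n - len(qs))
    cum_p = cum_q = 0.0
    for a, b in zip(ps, qs):
        cum_p += a
        cum_q += b
        if cum_p < cum_q - tol:
            return False
    return abs(cum_p - cum_q) <= 1e-9


# Hypothetical example distributions: p is more "peaked" than q.
p = [0.7, 0.2, 0.1]
q = [0.4, 0.35, 0.25]
p_majorizes_q = majorizes(p, q)
entropy_order_holds = entropy(p) <= entropy(q)
```

Here p majorizes q, so Schur-concavity implies that the entropy of p is no larger than the entropy of q.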
MaxGap Bandit: Adaptive Algorithms for Approximate Ranking
This paper studies the problem of adaptively sampling from K distributions
(arms) in order to identify the largest gap between any two adjacent means. We
call this the MaxGap-bandit problem. This problem arises naturally in
approximate ranking, noisy sorting, outlier detection, and top-arm
identification in bandits. The key novelty of the MaxGap-bandit problem is that
it aims to adaptively determine the natural partitioning of the distributions
into a subset with larger means and a subset with smaller means, where the
split is determined by the largest gap rather than a pre-specified rank or
threshold. Estimating an arm's gap requires sampling its neighboring arms in
addition to itself, and this dependence results in a novel hardness parameter
that characterizes the sample complexity of the problem. We propose elimination
and UCB-style algorithms and show that they are minimax optimal. Our
experiments show that the UCB-style algorithms require 6-8x fewer samples than
non-adaptive sampling to achieve the same error.
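To make the target of the MaxGap-bandit problem concrete, the sketch below computes the largest gap between adjacent means and the induced partition when the means are given exactly. It does not implement the adaptive elimination or UCB-style algorithms from the paper; the function name `max_gap_partition` and the example means are hypothetical.

```python
def max_gap_partition(means):
    """Split arm indices at the largest gap between adjacent sorted means.

    Returns (gap, top, bottom), where top holds the indices of the arms
    above the gap and bottom holds the indices of the arms below it.
    """
    if len(means) < 2:
        raise ValueError("need at least two arms")
    # Arm indices ordered from largest mean to smallest.
    order = sorted(range(len(means)), key=lambda i: means[i], reverse=True)
    best_pos, best_gap = 0, float("-inf")
    for pos in range(len(order) - 1):
        gap = means[order[pos]] - means[order[pos + 1]]
        if gap > best_gap:
            best_pos, best_gap = pos, gap
    return best_gap, order[: best_pos + 1], order[best_pos + 1:]


# Hypothetical means for five arms.
gap, top, bottom = max_gap_partition([0.9, 0.2, 0.85, 0.3, 0.1])
```

In the bandit setting the means are unknown and must be estimated from samples, which is where the dependence of each arm's gap on its neighboring arms enters.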
Learning Nearest Neighbor Graphs from Noisy Distance Samples
We consider the problem of learning the nearest neighbor graph of a dataset
of n items. The metric is unknown, but we can query an oracle to obtain a noisy
estimate of the distance between any pair of items. This framework applies to
problem domains where one wants to learn people's preferences from responses
commonly modeled as noisy distance judgments. In this paper, we propose an
active algorithm to find the graph with high probability and analyze its query
complexity. In contrast to existing work that forces Euclidean structure, our
method is valid for general metrics, assuming only symmetry and the triangle
inequality. Furthermore, we demonstrate the efficiency of our method empirically
and theoretically, needing only $O(n \log(n) \Delta^{-2})$ queries in favorable
settings, where $\Delta^{-2}$ accounts for the effect of noise. Using crowd-sourced
data collected for a subset of the UT Zappos50K dataset, we apply our algorithm
to learn which shoes people believe are most similar and show that it beats
both an active baseline and ordinal embedding.
Comment: 21 total pages (8 main pages + appendices), 7 figures, submitted to
NeurIPS 201
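The query model in this abstract can be sketched with a simple non-adaptive baseline: query every pair the same number of times, average the noisy answers, and pick each item's closest neighbor. This is not the paper's active algorithm. The oracle, the Gaussian noise model, the one-dimensional points, and all names below are assumptions made for illustration.

```python
import random


def make_noisy_oracle(points, sigma, rng):
    """Hypothetical oracle: true distance on a line plus Gaussian noise."""
    def oracle(i, j):
        return abs(points[i] - points[j]) + rng.gauss(0.0, sigma)
    return oracle


def nn_graph_uniform(n, oracle, samples_per_pair):
    """Non-adaptive baseline: average repeated queries for every pair,
    then return, for each item, the index of its estimated nearest neighbor."""
    est = {}
    for i in range(n):
        for j in range(i + 1, n):
            total = sum(oracle(i, j) for _ in range(samples_per_pair))
            est[(i, j)] = total / samples_per_pair

    def dist(i, j):
        return est[(min(i, j), max(i, j))]

    return [
        min((j for j in range(n) if j != i), key=lambda j: dist(i, j))
        for i in range(n)
    ]


# Four well-separated items on a line, with small noise.
rng = random.Random(0)
points = [0.0, 1.0, 3.0, 7.0]
oracle = make_noisy_oracle(points, sigma=0.1, rng=rng)
neighbors = nn_graph_uniform(len(points), oracle, samples_per_pair=50)
```

This baseline spends the same number of queries on every pair, roughly n^2 times the samples per pair in total, whereas an active method can stop querying pairs that are clearly not nearest neighbors.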
Data-driven Privacy-Preserving Communication
A communication system includes a receiver to receive training data and an input interface to receive input data, coupled to a hardware processor and a memory. The hardware processor is configured to initialize the privacy module using the training data and to generate a trained privacy module by iteratively optimizing an objective function. In each iteration, the objective function is computed as a combination of a distortion of the useful attributes in the transformed data and a mutual information between the sensitive attributes and the transformed data, where the mutual information is estimated by the auxiliary module, which maximizes a conditional likelihood of the sensitive attributes given the transformed data. The processor is further configured to receive the input data via the input interface and to apply the trained privacy module to the input data to produce application-specific transformed data. A transmitter transmits the application-specific transformed data over a communication channel.